Overview

Dataset statistics

Number of variables: 40
Number of observations: 50000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.6 MiB
Average record size in memory: 328.0 B
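
As a quick consistency check on the two memory figures above, the average record size should be roughly the total in-memory size divided by the number of observations. The sketch below only redoes that arithmetic from the rounded values printed in this report; the variable names are illustrative.

```python
# Rounded figures copied from the dataset statistics above.
total_size_mib = 15.6
n_observations = 50_000

# 1 MiB = 2**20 bytes.
total_size_bytes = total_size_mib * 2**20
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```

The result is about 327 B, which agrees with the reported 328.0 B once the rounding of 15.6 MiB is taken into account.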

Variable types

Numeric: 9
Categorical: 31

Alerts

count_floors_pre_eq is highly overall correlated with height_percentage and 1 other field [High correlation]
height_percentage is highly overall correlated with count_floors_pre_eq [High correlation]
foundation_type is highly overall correlated with roof_type and 4 other fields [High correlation]
roof_type is highly overall correlated with foundation_type and 1 other field [High correlation]
ground_floor_type is highly overall correlated with has_superstructure_mud_mortar_stone and 1 other field [High correlation]
other_floor_type is highly overall correlated with count_floors_pre_eq and 1 other field [High correlation]
has_superstructure_mud_mortar_stone is highly overall correlated with foundation_type and 1 other field [High correlation]
has_superstructure_cement_mortar_brick is highly overall correlated with foundation_type and 1 other field [High correlation]
has_superstructure_rc_non_engineered is highly overall correlated with foundation_type [High correlation]
has_superstructure_rc_engineered is highly overall correlated with foundation_type [High correlation]
has_secondary_use is highly overall correlated with has_secondary_use_agriculture and 1 other field [High correlation]
has_secondary_use_agriculture is highly overall correlated with has_secondary_use [High correlation]
has_secondary_use_hotel is highly overall correlated with has_secondary_use [High correlation]
land_surface_condition is highly imbalanced (51.7%) [Imbalance]
foundation_type is highly imbalanced (61.0%) [Imbalance]
ground_floor_type is highly imbalanced (59.3%) [Imbalance]
position is highly imbalanced (50.5%) [Imbalance]
plan_configuration is highly imbalanced (90.9%) [Imbalance]
has_superstructure_adobe_mud is highly imbalanced (57.2%) [Imbalance]
has_superstructure_stone_flag is highly imbalanced (78.5%) [Imbalance]
has_superstructure_cement_mortar_stone is highly imbalanced (86.9%) [Imbalance]
has_superstructure_mud_mortar_brick is highly imbalanced (65.1%) [Imbalance]
has_superstructure_cement_mortar_brick is highly imbalanced (61.3%) [Imbalance]
has_superstructure_bamboo is highly imbalanced (57.4%) [Imbalance]
has_superstructure_rc_non_engineered is highly imbalanced (75.2%) [Imbalance]
has_superstructure_rc_engineered is highly imbalanced (87.6%) [Imbalance]
has_superstructure_other is highly imbalanced (88.8%) [Imbalance]
legal_ownership_status is highly imbalanced (86.0%) [Imbalance]
has_secondary_use_agriculture is highly imbalanced (64.9%) [Imbalance]
has_secondary_use_hotel is highly imbalanced (78.5%) [Imbalance]
has_secondary_use_rental is highly imbalanced (93.5%) [Imbalance]
has_secondary_use_institution is highly imbalanced (98.9%) [Imbalance]
has_secondary_use_school is highly imbalanced (99.5%) [Imbalance]
has_secondary_use_industry is highly imbalanced (98.6%) [Imbalance]
has_secondary_use_health_post is highly imbalanced (99.7%) [Imbalance]
has_secondary_use_gov_office is highly imbalanced (99.9%) [Imbalance]
has_secondary_use_use_police is highly imbalanced (99.8%) [Imbalance]
has_secondary_use_other is highly imbalanced (95.4%) [Imbalance]
building_id has unique values [Unique]
geo_level_1_id has 769 (1.5%) zeros [Zeros]
age has 4964 (9.9%) zeros [Zeros]
count_families has 3967 (7.9%) zeros [Zeros]
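
The imbalance percentages in the alerts above appear consistent with one minus the normalized Shannon entropy of a variable's value counts. That reading is an assumption checked here against two variables only, not a statement of how ydata-profiling computes the score internally. The helper name `imbalance_score` is illustrative, and the counts are copied from the Common Values tables for foundation_type and ground_floor_type later in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0 means perfectly balanced categories; values near 1 mean one
    category dominates. Assumed formula, see the note above.
    """
    nonzero = [c for c in counts if c > 0]
    if len(nonzero) < 2:
        raise ValueError("need at least two non-empty categories")
    total = sum(nonzero)
    probs = [c / total for c in nonzero]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(probs))


# Counts copied from the Common Values tables in this report.
foundation_type = [42069, 2907, 2727, 2028, 269]
ground_floor_type = [40248, 4778, 4683, 197, 94]

print(f"foundation_type:   {imbalance_score(foundation_type):.1%}")
print(f"ground_floor_type: {imbalance_score(ground_floor_type):.1%}")
```

Under this assumption the two scores come out close to the 61.0% and 59.3% listed in the alerts.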

Reproduction

Analysis started: 2023-05-13 12:44:21.975970
Analysis finished: 2023-05-13 12:44:32.177811
Duration: 10.2 seconds
Software version: ydata-profiling v4.1.2
Download configuration: config.json

Variables

building_id
Real number (ℝ)

Distinct: 50000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 525877.64
Minimum: 4
Maximum: 1052934
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 4
5th percentile: 52876.95
Q1: 260832.75
Median: 526556.5
Q3: 788735
95th percentile: 1000735.3
Maximum: 1052934
Range: 1052930
Interquartile range (IQR): 527902.25

Descriptive statistics

Standard deviation: 304224.37
Coefficient of variation (CV): 0.57850791
Kurtosis: -1.2015965
Mean: 525877.64
Median Absolute Deviation (MAD): 263981.5
Skewness: 0.0023315302
Sum: 2.6293882 × 10^10
Variance: 9.2552468 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
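
Several of the building_id statistics above are derived from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the sum is the mean times the number of observations, and the range and IQR are simple differences. The following sketch recomputes them from the rounded values printed in this report, so agreement is only expected to within rounding.

```python
import math

# Values copied from the building_id tables in this report.
n = 50_000
mean, std = 525877.64, 304224.37
q1, q3 = 260832.75, 788735
minimum, maximum = 4, 1052934

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the standard deviation
total = mean * n                 # sum from the mean
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.8f} variance={variance:.7e} sum={total:.7e}")
print(f"IQR={iqr} range={value_range}")
```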
ValueCountFrequency (%)
802906 1
 
< 0.1%
148929 1
 
< 0.1%
806659 1
 
< 0.1%
176858 1
 
< 0.1%
711095 1
 
< 0.1%
345812 1
 
< 0.1%
684277 1
 
< 0.1%
610427 1
 
< 0.1%
570690 1
 
< 0.1%
576726 1
 
< 0.1%
Other values (49990) 49990
> 99.9%
ValueCountFrequency (%)
4 1
< 0.1%
16 1
< 0.1%
17 1
< 0.1%
31 1
< 0.1%
42 1
< 0.1%
65 1
< 0.1%
68 1
< 0.1%
118 1
< 0.1%
128 1
< 0.1%
151 1
< 0.1%
ValueCountFrequency (%)
1052934 1
< 0.1%
1052908 1
< 0.1%
1052903 1
< 0.1%
1052883 1
< 0.1%
1052880 1
< 0.1%
1052867 1
< 0.1%
1052855 1
< 0.1%
1052847 1
< 0.1%
1052797 1
< 0.1%
1052794 1
< 0.1%

geo_level_1_id
Real number (ℝ)

Distinct: 31
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.91538
Minimum: 0
Maximum: 30
Zeros: 769
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 3
Q1: 7
Median: 12
Q3: 21
95th percentile: 27
Maximum: 30
Range: 30
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 8.0080224
Coefficient of variation (CV): 0.57547996
Kurtosis: -1.2112282
Mean: 13.91538
Median Absolute Deviation (MAD): 6
Skewness: 0.26578479
Sum: 695769
Variance: 64.128422
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
6 4737
 
9.5%
17 4283
 
8.6%
26 4207
 
8.4%
10 4206
 
8.4%
7 3693
 
7.4%
8 3608
 
7.2%
20 3363
 
6.7%
21 2949
 
5.9%
4 2776
 
5.6%
27 2388
 
4.8%
Other values (21) 13790
27.6%
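
The "Other values" row that closes the table above is the remainder after the ten most frequent values are listed. A minimal sketch of that bookkeeping, using the geo_level_1_id counts from the table (variable names are illustrative):

```python
n_observations = 50_000
n_distinct = 31

# Ten most frequent geo_level_1_id values and their counts, from the table above.
top_counts = {6: 4737, 17: 4283, 26: 4207, 10: 4206, 7: 3693,
              8: 3608, 20: 3363, 21: 2949, 4: 2776, 27: 2388}

other_count = n_observations - sum(top_counts.values())
other_distinct = n_distinct - len(top_counts)
other_pct = 100 * other_count / n_observations

print(f"Other values ({other_distinct}): {other_count} ({other_pct:.1f}%)")
```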
ValueCountFrequency (%)
0 769
 
1.5%
1 520
 
1.0%
2 186
 
0.4%
3 1364
 
2.7%
4 2776
5.6%
5 491
 
1.0%
6 4737
9.5%
7 3693
7.4%
8 3608
7.2%
9 747
 
1.5%
ValueCountFrequency (%)
30 507
 
1.0%
29 79
 
0.2%
28 48
 
0.1%
27 2388
4.8%
26 4207
8.4%
25 1118
 
2.2%
24 272
 
0.5%
23 207
 
0.4%
22 1181
 
2.4%
21 2949
5.9%

geo_level_2_id
Real number (ℝ)

Distinct: 1351
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 700.13604
Minimum: 0
Maximum: 1427
Zeros: 6
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 70
Q1: 350
Median: 698
Q3: 1049
95th percentile: 1376
Maximum: 1427
Range: 1427
Interquartile range (IQR): 699

Descriptive statistics

Standard deviation: 412.69801
Coefficient of variation (CV): 0.58945403
Kurtosis: -1.1883201
Mean: 700.13604
Median Absolute Deviation (MAD): 350
Skewness: 0.03762659
Sum: 35006802
Variance: 170319.65
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
39 745
 
1.5%
158 477
 
1.0%
157 386
 
0.8%
363 380
 
0.8%
181 380
 
0.8%
1387 363
 
0.7%
673 360
 
0.7%
533 339
 
0.7%
463 334
 
0.7%
548 312
 
0.6%
Other values (1341) 45924
91.8%
ValueCountFrequency (%)
0 6
 
< 0.1%
1 37
0.1%
3 12
 
< 0.1%
4 59
0.1%
5 4
 
< 0.1%
6 1
 
< 0.1%
7 17
 
< 0.1%
8 24
 
< 0.1%
9 61
0.1%
10 78
0.2%
ValueCountFrequency (%)
1427 2
 
< 0.1%
1426 61
0.1%
1425 89
0.2%
1424 1
 
< 0.1%
1423 1
 
< 0.1%
1422 45
0.1%
1421 53
0.1%
1420 1
 
< 0.1%
1419 29
 
0.1%
1418 30
 
0.1%

geo_level_3_id
Real number (ℝ)

Distinct: 9215
Distinct (%): 18.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6270.3066
Minimum: 3
Maximum: 12565
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 3
5th percentile: 609
Q1: 3083.75
Median: 6276
Q3: 9429
95th percentile: 11945
Maximum: 12565
Range: 12562
Interquartile range (IQR): 6345.25

Descriptive statistics

Standard deviation: 3653.061
Coefficient of variation (CV): 0.58259687
Kurtosis: -1.216275
Mean: 6270.3066
Median Absolute Deviation (MAD): 3177
Skewness: -0.00092974109
Sum: 3.1351533 × 10^8
Variance: 13344855
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
9133 116
 
0.2%
633 111
 
0.2%
621 99
 
0.2%
11440 97
 
0.2%
11246 90
 
0.2%
2005 85
 
0.2%
7723 84
 
0.2%
9229 64
 
0.1%
2452 62
 
0.1%
7868 58
 
0.1%
Other values (9205) 49134
98.3%
ValueCountFrequency (%)
3 1
 
< 0.1%
5 2
 
< 0.1%
6 1
 
< 0.1%
8 8
< 0.1%
11 7
< 0.1%
14 4
< 0.1%
15 8
< 0.1%
16 9
< 0.1%
17 3
 
< 0.1%
19 3
 
< 0.1%
ValueCountFrequency (%)
12565 1
 
< 0.1%
12564 3
 
< 0.1%
12563 4
 
< 0.1%
12561 4
 
< 0.1%
12560 3
 
< 0.1%
12559 1
 
< 0.1%
12557 10
< 0.1%
12556 3
 
< 0.1%
12555 3
 
< 0.1%
12553 5
< 0.1%

count_floors_pre_eq
Real number (ℝ)

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.12872
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 2
Q3: 2
95th percentile: 3
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.72795725
Coefficient of variation (CV): 0.34196947
Kurtosis: 2.6076542
Mean: 2.12872
Median Absolute Deviation (MAD): 0
Skewness: 0.86970281
Sum: 106436
Variance: 0.52992176
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
ValueCountFrequency (%)
2 30131
60.3%
3 10620
 
21.2%
1 7746
 
15.5%
4 1009
 
2.0%
5 449
 
0.9%
6 31
 
0.1%
7 12
 
< 0.1%
8 1
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
1 7746
 
15.5%
2 30131
60.3%
3 10620
 
21.2%
4 1009
 
2.0%
5 449
 
0.9%
6 31
 
0.1%
7 12
 
< 0.1%
8 1
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
9 1
 
< 0.1%
8 1
 
< 0.1%
7 12
 
< 0.1%
6 31
 
0.1%
5 449
 
0.9%
4 1009
 
2.0%
3 10620
 
21.2%
2 30131
60.3%
1 7746
 
15.5%
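
Because count_floors_pre_eq takes only nine distinct values, its summary statistics can be recomputed directly from the frequency table above. The sketch below does so with plain Python. It uses the sample variance (denominator n - 1), which is the convention that matches the figure reported here; that choice is inferred from the numbers, not stated in the report.

```python
# value -> count, copied from the count_floors_pre_eq frequency table above.
counts = {1: 7746, 2: 30131, 3: 10620, 4: 1009, 5: 449,
          6: 31, 7: 12, 8: 1, 9: 1}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(count * (value - mean) ** 2
               for value, count in counts.items()) / (n - 1)


def value_at_rank(rank):
    """Return the value at 1-based position `rank` of the sorted data."""
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        if cumulative >= rank:
            return value
    raise ValueError("rank exceeds the number of observations")


# n is even, so the median averages the two middle observations.
median = (value_at_rank(n // 2) + value_at_rank(n // 2 + 1)) / 2

print(n, total, mean, round(variance, 8), median)
```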

age
Real number (ℝ)

Distinct: 36
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.4032
Minimum: 0
Maximum: 995
Zeros: 4964
Zeros (%): 9.9%
Negative: 0
Negative (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 10
Median: 15
Q3: 30
95th percentile: 60
Maximum: 995
Range: 995
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 72.817382
Coefficient of variation (CV): 2.7578999
Kurtosis: 160.48721
Mean: 26.4032
Median Absolute Deviation (MAD): 10
Skewness: 12.310651
Sum: 1320160
Variance: 5302.3711
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
ValueCountFrequency (%)
10 7443
14.9%
15 6985
14.0%
5 6455
12.9%
20 6186
12.4%
0 4964
9.9%
25 4642
9.3%
30 3451
6.9%
35 2093
 
4.2%
40 2022
 
4.0%
50 1437
 
2.9%
Other values (26) 4322
8.6%
ValueCountFrequency (%)
0 4964
9.9%
5 6455
12.9%
10 7443
14.9%
15 6985
14.0%
20 6186
12.4%
25 4642
9.3%
30 3451
6.9%
35 2093
 
4.2%
40 2022
 
4.0%
45 889
 
1.8%
ValueCountFrequency (%)
995 261
0.5%
200 27
 
0.1%
190 1
 
< 0.1%
175 1
 
< 0.1%
160 1
 
< 0.1%
155 1
 
< 0.1%
150 20
 
< 0.1%
140 3
 
< 0.1%
135 1
 
< 0.1%
130 2
 
< 0.1%
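
The age tables above show 261 rows at the value 995, far from the next largest value of 200, alongside a skewness of about 12.3. The report does not say whether 995 is a placeholder or a genuine measurement, and the sketch below does not assume either. It only illustrates, from the reported sum and counts, how much those rows shift the mean.

```python
n = 50_000
total_age = 1_320_160                      # "Sum" from the descriptive statistics
extreme_value, extreme_count = 995, 261    # top row of the largest-values table

mean_all = total_age / n
mean_without = (total_age - extreme_value * extreme_count) / (n - extreme_count)

print(f"mean with all rows:      {mean_all:.4f}")
print(f"mean without 995 rows:   {mean_without:.4f}")
```

About 0.5% of the rows account for roughly five units of the mean, which is one reason the median (15) may be a more robust summary for this variable.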

area_percentage
Real number (ℝ)

Distinct: 69
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.01296
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 5
Median: 7
Q3: 9
95th percentile: 16
Maximum: 100
Range: 99
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.4057485
Coefficient of variation (CV): 0.54982785
Kurtosis: 29.913623
Mean: 8.01296
Median Absolute Deviation (MAD): 2
Skewness: 3.5361778
Sum: 400648
Variance: 19.41062
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6 8191
16.4%
7 6937
13.9%
5 6263
12.5%
8 5475
10.9%
9 4267
8.5%
4 3662
7.3%
10 3037
 
6.1%
11 2676
 
5.4%
3 2317
 
4.6%
12 1410
 
2.8%
Other values (59) 5765
11.5%
ValueCountFrequency (%)
1 17
 
< 0.1%
2 598
 
1.2%
3 2317
 
4.6%
4 3662
7.3%
5 6263
12.5%
6 8191
16.4%
7 6937
13.9%
8 5475
10.9%
9 4267
8.5%
10 3037
 
6.1%
ValueCountFrequency (%)
100 1
 
< 0.1%
86 1
 
< 0.1%
85 1
 
< 0.1%
78 1
 
< 0.1%
75 1
 
< 0.1%
67 4
< 0.1%
66 3
< 0.1%
65 2
< 0.1%
64 1
 
< 0.1%
63 2
< 0.1%

height_percentage
Real number (ℝ)

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.43816
Minimum: 2
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 2
5th percentile: 3
Q1: 4
Median: 5
Q3: 6
95th percentile: 9
Maximum: 32
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.9425888
Coefficient of variation (CV): 0.35721435
Kurtosis: 16.945686
Mean: 5.43816
Median Absolute Deviation (MAD): 1
Skewness: 2.0178173
Sum: 271908
Variance: 3.7736513
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
5 15181
30.4%
6 9001
18.0%
4 7154
14.3%
7 6734
13.5%
3 4895
 
9.8%
8 2648
 
5.3%
2 1848
 
3.7%
9 1023
 
2.0%
10 839
 
1.7%
11 191
 
0.4%
Other values (14) 486
 
1.0%
ValueCountFrequency (%)
2 1848
 
3.7%
3 4895
 
9.8%
4 7154
14.3%
5 15181
30.4%
6 9001
18.0%
7 6734
13.5%
8 2648
 
5.3%
9 1023
 
2.0%
10 839
 
1.7%
11 191
 
0.4%
ValueCountFrequency (%)
32 19
< 0.1%
26 1
 
< 0.1%
25 1
 
< 0.1%
23 2
 
< 0.1%
21 6
 
< 0.1%
20 6
 
< 0.1%
19 1
 
< 0.1%
18 11
 
< 0.1%
17 4
 
< 0.1%
16 36
0.1%

land_surface_condition
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: t
2nd row: o
3rd row: t
4th row: t
5th row: t

Common Values

ValueCountFrequency (%)
t 41685
83.4%
n 6732
 
13.5%
o 1583
 
3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
t 41685
83.4%
n 6732
 
13.5%
o 1583
 
3.2%

Most occurring characters

ValueCountFrequency (%)
t 41685
83.4%
n 6732
 
13.5%
o 1583
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50000
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 41685
83.4%
n 6732
 
13.5%
o 1583
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 41685
83.4%
n 6732
 
13.5%
o 1583
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 41685
83.4%
n 6732
 
13.5%
o 1583
 
3.2%
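
The "Lowercase Letter" rows above correspond to the Unicode general category of each character in the column. A minimal sketch using Python's standard `unicodedata` module shows how that property can be looked up. The script and block breakdowns are not available from that module and are not reproduced here, and the mapping from two-letter codes to long names is written out by hand for the two categories that occur in this report.

```python
import unicodedata

# Hand-written long names for the two general categories seen in this report.
LONG_NAMES = {"Ll": "Lowercase Letter", "Nd": "Decimal Number"}


def general_category(ch):
    """Return a readable Unicode general category for a single character."""
    code = unicodedata.category(ch)
    return LONG_NAMES.get(code, code)


for ch in "tno":
    print(ch, general_category(ch))
```

Single-letter codes such as `t`, `n` and `o` all fall in one category, which is why the report lists one distinct category for this variable.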

foundation_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: r
2nd row: r
3rd row: r
4th row: r
5th row: r

Common Values

ValueCountFrequency (%)
r 42069
84.1%
w 2907
 
5.8%
u 2727
 
5.5%
i 2028
 
4.1%
h 269
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
r 42069
84.1%
w 2907
 
5.8%
u 2727
 
5.5%
i 2028
 
4.1%
h 269
 
0.5%

Most occurring characters

ValueCountFrequency (%)
r 42069
84.1%
w 2907
 
5.8%
u 2727
 
5.5%
i 2028
 
4.1%
h 269
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50000
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 42069
84.1%
w 2907
 
5.8%
u 2727
 
5.5%
i 2028
 
4.1%
h 269
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 42069
84.1%
w 2907
 
5.8%
u 2727
 
5.5%
i 2028
 
4.1%
h 269
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 42069
84.1%
w 2907
 
5.8%
u 2727
 
5.5%
i 2028
 
4.1%
h 269
 
0.5%

roof_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: n
2nd row: n
3rd row: n
4th row: n
5th row: n

Common Values

ValueCountFrequency (%)
n 35125
70.2%
q 11774
 
23.5%
x 3101
 
6.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
n 35125
70.2%
q 11774
 
23.5%
x 3101
 
6.2%

Most occurring characters

ValueCountFrequency (%)
n 35125
70.2%
q 11774
 
23.5%
x 3101
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50000
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 35125
70.2%
q 11774
 
23.5%
x 3101
 
6.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 35125
70.2%
q 11774
 
23.5%
x 3101
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 35125
70.2%
q 11774
 
23.5%
x 3101
 
6.2%

ground_floor_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: f
2nd row: x
3rd row: f
4th row: f
5th row: f

Common Values

ValueCountFrequency (%)
f 40248
80.5%
v 4778
 
9.6%
x 4683
 
9.4%
z 197
 
0.4%
m 94
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
f 40248
80.5%
v 4778
 
9.6%
x 4683
 
9.4%
z 197
 
0.4%
m 94
 
0.2%

Most occurring characters

ValueCountFrequency (%)
f 40248
80.5%
v 4778
 
9.6%
x 4683
 
9.4%
z 197
 
0.4%
m 94
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50000
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 40248
80.5%
v 4778
 
9.6%
x 4683
 
9.4%
z 197
 
0.4%
m 94
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 40248
80.5%
v 4778
 
9.6%
x 4683
 
9.4%
z 197
 
0.4%
m 94
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 40248
80.5%
v 4778
 
9.6%
x 4683
 
9.4%
z 197
 
0.4%
m 94
 
0.2%

other_floor_type
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: q
2nd row: q
3rd row: x
4th row: x
5th row: x

Common Values

ValueCountFrequency (%)
q 31698
63.4%
x 8311
 
16.6%
j 7642
 
15.3%
s 2349
 
4.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
q 31698
63.4%
x 8311
 
16.6%
j 7642
 
15.3%
s 2349
 
4.7%

Most occurring characters

ValueCountFrequency (%)
q 31698
63.4%
x 8311
 
16.6%
j 7642
 
15.3%
s 2349
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50000
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
q 31698
63.4%
x 8311
 
16.6%
j 7642
 
15.3%
s 2349
 
4.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
q 31698
63.4%
x 8311
 
16.6%
j 7642
 
15.3%
s 2349
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
q 31698
63.4%
x 8311
 
16.6%
j 7642
 
15.3%
s 2349
 
4.7%

position
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: t
2nd row: s
3rd row: t
4th row: s
5th row: s

Common Values

ValueCountFrequency (%)
s 38807
77.6%
t 8227
 
16.5%
j 2513
 
5.0%
o 453
 
0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
s 38807
77.6%
t 8227
 
16.5%
j 2513
 
5.0%
o 453
 
0.9%

Most occurring characters

ValueCountFrequency (%)
s 38807
77.6%
t 8227
 
16.5%
j 2513
 
5.0%
o 453
 
0.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50000
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 38807
77.6%
t 8227
 
16.5%
j 2513
 
5.0%
o 453
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 38807
77.6%
t 8227
 
16.5%
j 2513
 
5.0%
o 453
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 38807
77.6%
t 8227
 
16.5%
j 2513
 
5.0%
o 453
 
0.9%

plan_configuration
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: d
2nd row: d
3rd row: d
4th row: d
5th row: d

Common Values

ValueCountFrequency (%)
d 48041
96.1%
q 1077
 
2.2%
u 671
 
1.3%
c 55
 
0.1%
s 52
 
0.1%
a 44
 
0.1%
o 29
 
0.1%
n 13
 
< 0.1%
m 12
 
< 0.1%
f 6
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
d 48041
96.1%
q 1077
 
2.2%
u 671
 
1.3%
c 55
 
0.1%
s 52
 
0.1%
a 44
 
0.1%
o 29
 
0.1%
n 13
 
< 0.1%
m 12
 
< 0.1%
f 6
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
d 48041
96.1%
q 1077
 
2.2%
u 671
 
1.3%
c 55
 
0.1%
s 52
 
0.1%
a 44
 
0.1%
o 29
 
0.1%
n 13
 
< 0.1%
m 12
 
< 0.1%
f 6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50000
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 48041
96.1%
q 1077
 
2.2%
u 671
 
1.3%
c 55
 
0.1%
s 52
 
0.1%
a 44
 
0.1%
o 29
 
0.1%
n 13
 
< 0.1%
m 12
 
< 0.1%
f 6
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 48041
96.1%
q 1077
 
2.2%
u 671
 
1.3%
c 55
 
0.1%
s 52
 
0.1%
a 44
 
0.1%
o 29
 
0.1%
n 13
 
< 0.1%
m 12
 
< 0.1%
f 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d 48041
96.1%
q 1077
 
2.2%
u 671
 
1.3%
c 55
 
0.1%
s 52
 
0.1%
a 44
 
0.1%
o 29
 
0.1%
n 13
 
< 0.1%
m 12
 
< 0.1%
f 6
 
< 0.1%

has_superstructure_adobe_mud
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 1

Common Values

ValueCountFrequency (%)
0 45623
91.2%
1 4377
 
8.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 45623
91.2%
1 4377
 
8.8%

Most occurring characters

ValueCountFrequency (%)
0 45623
91.2%
1 4377
 
8.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 45623
91.2%
1 4377
 
8.8%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 45623
91.2%
1 4377
 
8.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 45623
91.2%
1 4377
 
8.8%

has_superstructure_mud_mortar_stone
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

ValueCountFrequency (%)
1 38203
76.4%
0 11797
 
23.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 38203
76.4%
0 11797
 
23.6%

Most occurring characters

ValueCountFrequency (%)
1 38203
76.4%
0 11797
 
23.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 38203
76.4%
0 11797
 
23.6%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 38203
76.4%
0 11797
 
23.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 38203
76.4%
0 11797
 
23.6%

has_superstructure_stone_flag
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 48288
96.6%
1 1712
 
3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 48288
96.6%
1 1712
 
3.4%

Most occurring characters

ValueCountFrequency (%)
0 48288
96.6%
1 1712
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 48288
96.6%
1 1712
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 48288
96.6%
1 1712
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 48288
96.6%
1 1712
 
3.4%

has_superstructure_cement_mortar_stone
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 49090
98.2%
1 910
 
1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49090
98.2%
1 910
 
1.8%

Most occurring characters

ValueCountFrequency (%)
0 49090
98.2%
1 910
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49090
98.2%
1 910
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49090
98.2%
1 910
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49090
98.2%
1 910
 
1.8%

has_superstructure_mud_mortar_brick
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 50000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 46725
93.5%
1 3275
 
6.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 46725
93.5%
1 3275
 
6.6%

Most occurring characters

ValueCountFrequency (%)
0 46725
93.5%
1 3275
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 46725
93.5%
1 3275
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 46725
93.5%
1 3275
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 46725
93.5%
1 3275
 
6.6%

has_superstructure_cement_mortar_brick
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
46209 
1
 
3791

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 46209
92.4%
1 3791
 
7.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 46209
92.4%
1 3791
 
7.6%

Most occurring characters

ValueCountFrequency (%)
0 46209
92.4%
1 3791
 
7.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 46209
92.4%
1 3791
 
7.6%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 46209
92.4%
1 3791
 
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 46209
92.4%
1 3791
 
7.6%
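
The IMBALANCE flag on has_superstructure_cement_mortar_brick reflects the 46209 / 3791 split shown above. The report does not say how the flag is computed. As a rough sketch, assuming the score is one minus the Shannon entropy of the class shares, normalized by log2 of the number of classes (an assumption, as is the helper name `imbalance_score`), the split could be scored like this:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, 1 for a single class.

    Hypothetical helper. The normalized-entropy definition is an assumption,
    not something the report states.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 1.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts for has_superstructure_cement_mortar_brick, copied from the table above.
score = imbalance_score([46209, 3791])
print(f"imbalance score: {score:.3f}")
```

Under this definition a value near 0 means the classes are balanced, and values closer to 1 mean one class dominates.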
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
37245 
1
12755 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 37245
74.5%
1 12755
 
25.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 37245
74.5%
1 12755
 
25.5%

Most occurring characters

ValueCountFrequency (%)
0 37245
74.5%
1 12755
 
25.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 37245
74.5%
1 12755
 
25.5%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 37245
74.5%
1 12755
 
25.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 37245
74.5%
1 12755
 
25.5%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
45658 
1
 
4342

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row0

Common Values

ValueCountFrequency (%)
0 45658
91.3%
1 4342
 
8.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 45658
91.3%
1 4342
 
8.7%

Most occurring characters

ValueCountFrequency (%)
0 45658
91.3%
1 4342
 
8.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 45658
91.3%
1 4342
 
8.7%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 45658
91.3%
1 4342
 
8.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 45658
91.3%
1 4342
 
8.7%

has_superstructure_rc_non_engineered
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
47935 
1
 
2065

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 47935
95.9%
1 2065
 
4.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 47935
95.9%
1 2065
 
4.1%

Most occurring characters

ValueCountFrequency (%)
0 47935
95.9%
1 2065
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 47935
95.9%
1 2065
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 47935
95.9%
1 2065
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 47935
95.9%
1 2065
 
4.1%

has_superstructure_rc_engineered
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49153 
1
 
847

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49153
98.3%
1 847
 
1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49153
98.3%
1 847
 
1.7%

Most occurring characters

ValueCountFrequency (%)
0 49153
98.3%
1 847
 
1.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49153
98.3%
1 847
 
1.7%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49153
98.3%
1 847
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49153
98.3%
1 847
 
1.7%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49249 
1
 
751

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49249
98.5%
1 751
 
1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49249
98.5%
1 751
 
1.5%

Most occurring characters

ValueCountFrequency (%)
0 49249
98.5%
1 751
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49249
98.5%
1 751
 
1.5%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49249
98.5%
1 751
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49249
98.5%
1 751
 
1.5%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
v
48143 
a
 
1070
w
 
505
r
 
282

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowv
2nd rowv
3rd rowv
4th rowv
5th rowv

Common Values

ValueCountFrequency (%)
v 48143
96.3%
a 1070
 
2.1%
w 505
 
1.0%
r 282
 
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
v 48143
96.3%
a 1070
 
2.1%
w 505
 
1.0%
r 282
 
0.6%

Most occurring characters

ValueCountFrequency (%)
v 48143
96.3%
a 1070
 
2.1%
w 505
 
1.0%
r 282
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 50000
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
v 48143
96.3%
a 1070
 
2.1%
w 505
 
1.0%
r 282
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 50000
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
v 48143
96.3%
a 1070
 
2.1%
w 505
 
1.0%
r 282
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
v 48143
96.3%
a 1070
 
2.1%
w 505
 
1.0%
r 282
 
0.6%

count_families
Real number (ℝ)

Distinct7
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean0.98384
Minimum0
Maximum6
Zeros3967
Zeros (%)7.9%
Negative0
Negative (%)0.0%
Memory size781.2 KiB

Quantile statistics

Minimum0
5-th percentile0
Q11
median1
Q31
95-th percentile2
Maximum6
Range6
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.41301606
Coefficient of variation (CV)0.41980003
Kurtosis14.810921
Mean0.98384
Median Absolute Deviation (MAD)0
Skewness1.4466337
Sum49192
Variance0.17058227
MonotonicityNot monotonic
Histogram with fixed size bins (bins=7)

Value  Count  Frequency (%)
1      43417  86.8%
0      3967   7.9%
2      2200   4.4%
3      323    0.6%
4      63     0.1%
5      26     0.1%
6      4      < 0.1%

Value  Count  Frequency (%)
0      3967   7.9%
1      43417  86.8%
2      2200   4.4%
3      323    0.6%
4      63     0.1%
5      26     0.1%
6      4      < 0.1%

Value  Count  Frequency (%)
6      4      < 0.1%
5      26     0.1%
4      63     0.1%
3      323    0.6%
2      2200   4.4%
1      43417  86.8%
0      3967   7.9%
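
The descriptive statistics for count_families can be re-derived from its frequency table. The sketch below does so in plain Python, assuming the reported variance and standard deviation use the sample (n - 1) denominator. That assumption and the variable names are illustrative, not taken from the report.

```python
import math

# Frequency table for count_families, copied from the report: value -> count.
freq = {0: 3967, 1: 43417, 2: 2200, 3: 323, 4: 63, 5: 26, 6: 4}

n = sum(freq.values())                       # number of observations
total = sum(v * c for v, c in freq.items())  # the report's "Sum"
mean = total / n

# Sample variance with the n - 1 denominator (assumed, see above).
ss = sum(c * (v - mean) ** 2 for v, c in freq.items())
variance = ss / (n - 1)
std = math.sqrt(variance)
cv = std / mean                              # coefficient of variation

print(n, total, mean)
print(round(variance, 8), round(std, 8), round(cv, 8))
```

Under that assumption the results reproduce the report's Sum, Mean, Variance, Standard deviation and Coefficient of variation to the precision shown.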
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
44294 
1
5706 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 44294
88.6%
1 5706
 
11.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 44294
88.6%
1 5706
 
11.4%

Most occurring characters

ValueCountFrequency (%)
0 44294
88.6%
1 5706
 
11.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 44294
88.6%
1 5706
 
11.4%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 44294
88.6%
1 5706
 
11.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 44294
88.6%
1 5706
 
11.4%

has_secondary_use_agriculture
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
46697 
1
 
3303

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 46697
93.4%
1 3303
 
6.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 46697
93.4%
1 3303
 
6.6%

Most occurring characters

ValueCountFrequency (%)
0 46697
93.4%
1 3303
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 46697
93.4%
1 3303
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 46697
93.4%
1 3303
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 46697
93.4%
1 3303
 
6.6%

has_secondary_use_hotel
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
48287 
1
 
1713

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 48287
96.6%
1 1713
 
3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 48287
96.6%
1 1713
 
3.4%

Most occurring characters

ValueCountFrequency (%)
0 48287
96.6%
1 1713
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 48287
96.6%
1 1713
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 48287
96.6%
1 1713
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 48287
96.6%
1 1713
 
3.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49617 
1
 
383

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49617
99.2%
1 383
 
0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49617
99.2%
1 383
 
0.8%

Most occurring characters

ValueCountFrequency (%)
0 49617
99.2%
1 383
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49617
99.2%
1 383
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49617
99.2%
1 383
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49617
99.2%
1 383
 
0.8%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49953 
1
 
47

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49953
99.9%
1 47
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49953
99.9%
1 47
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 49953
99.9%
1 47
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49953
99.9%
1 47
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49953
99.9%
1 47
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49953
99.9%
1 47
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49981 
1
 
19

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49981
> 99.9%
1 19
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49981
> 99.9%
1 19
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 49981
> 99.9%
1 19
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49981
> 99.9%
1 19
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49981
> 99.9%
1 19
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49981
> 99.9%
1 19
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49938 
1
 
62

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49938
99.9%
1 62
 
0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49938
99.9%
1 62
 
0.1%

Most occurring characters

ValueCountFrequency (%)
0 49938
99.9%
1 62
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49938
99.9%
1 62
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49938
99.9%
1 62
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49938
99.9%
1 62
 
0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49987 
1
 
13

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49987
> 99.9%
1 13
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49987
> 99.9%
1 13
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 49987
> 99.9%
1 13
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49987
> 99.9%
1 13
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49987
> 99.9%
1 13
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49987
> 99.9%
1 13
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49997 
1
 
3

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49997
> 99.9%
1 3
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49997
> 99.9%
1 3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 49997
> 99.9%
1 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49997
> 99.9%
1 3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49997
> 99.9%
1 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49997
> 99.9%
1 3
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49992 
1
 
8

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49992
> 99.9%
1 8
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49992
> 99.9%
1 8
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 49992
> 99.9%
1 8
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49992
> 99.9%
1 8
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49992
> 99.9%
1 8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49992
> 99.9%
1 8
 
< 0.1%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
0
49746 
1
 
254

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 49746
99.5%
1 254
 
0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 49746
99.5%
1 254
 
0.5%

Most occurring characters

ValueCountFrequency (%)
0 49746
99.5%
1 254
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 49746
99.5%
1 254
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 49746
99.5%
1 254
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 49746
99.5%
1 254
 
0.5%

damage_grade
Categorical

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size781.2 KiB
2
28366 
3
16826 
1
4808 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters50000
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row3
2nd row2
3rd row3
4th row2
5th row3

Common Values

ValueCountFrequency (%)
2 28366
56.7%
3 16826
33.7%
1 4808
 
9.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2 28366
56.7%
3 16826
33.7%
1 4808
 
9.6%

Most occurring characters

ValueCountFrequency (%)
2 28366
56.7%
3 16826
33.7%
1 4808
 
9.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 50000
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 28366
56.7%
3 16826
33.7%
1 4808
 
9.6%

Most occurring scripts

ValueCountFrequency (%)
Common 50000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 28366
56.7%
3 16826
33.7%
1 4808
 
9.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 50000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 28366
56.7%
3 16826
33.7%
1 4808
 
9.6%
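
The frequency column for damage_grade is each class count divided by the 50000 observations. A minimal sketch (variable names are illustrative) that rebuilds the column from the counts above:

```python
# damage_grade counts copied from the report: class label -> count.
counts = {"2": 28366, "3": 16826, "1": 4808}

n = sum(counts.values())

# Share of each class as a percentage string with one decimal place.
shares = {label: f"{100 * c / n:.1f}%" for label, c in counts.items()}

for label, share in shares.items():
    print(label, counts[label], share)
```

The same one-decimal rounding applies to the other categorical tables in the report.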

Interactions

[Figure: 81 interaction plots, Matplotlib v3.6.3, rendered 2023-05-13]

Correlations

[Figure: correlation heatmap, Matplotlib v3.6.3, rendered 2023-05-13]
building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage count_families land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
building_id 1.000 -0.007 -0.001 -0.005 0.006 0.003 -0.002 0.010 -0.003 0.000 0.007 0.000 0.000 0.006 0.000 0.000 0.000 0.000 0.005 0.013 0.005 0.008 0.000 0.000 0.013 0.000 0.009 0.003 0.000 0.004 0.000 0.005 0.000 0.005 0.000 0.002 0.000 0.001 0.000 0.013
geo_level_1_id -0.007 1.000 -0.066 0.008 -0.088 -0.064 0.035 -0.077 0.035 0.032 0.196 0.203 0.108 0.126 0.134 0.030 0.269 0.330 0.104 0.062 0.272 0.208 0.208 0.235 0.042 0.060 0.056 0.085 0.105 0.133 0.029 0.039 0.000 0.000 0.009 0.017 0.000 0.000 0.032 0.269
geo_level_2_id -0.001 -0.066 1.000 -0.002 0.037 0.033 -0.020 0.031 -0.015 0.048 0.098 0.084 0.067 0.068 0.078 0.016 0.081 0.157 0.057 0.027 0.097 0.125 0.089 0.092 0.059 0.066 0.041 0.065 0.024 0.032 0.036 0.034 0.009 0.016 0.000 0.000 0.000 0.000 0.025 0.068
geo_level_3_id -0.005 0.008 -0.002 1.000 -0.016 0.005 0.003 -0.021 -0.000 0.028 0.033 0.036 0.023 0.030 0.027 0.009 0.041 0.033 0.032 0.020 0.055 0.033 0.054 0.037 0.030 0.043 0.025 0.029 0.000 0.026 0.019 0.011 0.007 0.004 0.009 0.000 0.000 0.019 0.016 0.021
count_floors_pre_eq 0.006 -0.088 0.037 -0.016 1.000 0.259 0.126 0.754 0.075 0.051 0.147 0.186 0.126 0.583 0.316 0.057 0.197 0.354 0.050 0.039 0.386 0.257 0.097 0.085 0.105 0.151 0.033 0.066 0.082 0.054 0.153 0.064 0.044 0.009 0.043 0.027 0.033 0.018 0.020 0.151
age 0.003 -0.064 0.033 0.005 0.259 1.000 -0.021 0.194 0.052 0.017 0.041 0.011 0.035 0.021 0.113 0.008 0.104 0.069 0.029 0.000 0.147 0.000 0.017 0.012 0.009 0.009 0.000 0.009 0.008 0.014 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.009 0.020
area_percentage -0.002 0.035 -0.020 0.003 0.126 -0.021 1.000 0.210 0.070 0.021 0.169 0.255 0.170 0.188 0.051 0.056 0.035 0.245 0.004 0.074 0.062 0.216 0.054 0.035 0.192 0.227 0.004 0.000 0.118 0.016 0.163 0.098 0.075 0.069 0.009 0.013 0.020 0.000 0.023 0.105
height_percentage 0.010 -0.077 0.031 -0.021 0.754 0.194 0.210 1.000 0.056 0.032 0.171 0.243 0.124 0.305 0.218 0.042 0.152 0.261 0.020 0.043 0.250 0.146 0.062 0.061 0.168 0.258 0.010 0.036 0.135 0.047 0.207 0.109 0.066 0.023 0.046 0.059 0.062 0.016 0.048 0.097
count_families -0.003 0.035 -0.015 -0.000 0.075 0.052 0.070 0.056 1.000 0.012 0.057 0.081 0.048 0.071 0.043 0.000 0.036 0.065 0.006 0.019 0.042 0.048 0.039 0.032 0.053 0.071 0.007 0.017 0.112 0.046 0.081 0.086 0.042 0.018 0.016 0.009 0.072 0.016 0.036 0.063
land_surface_condition 0.000 0.032 0.048 0.028 0.051 0.017 0.021 0.032 0.012 1.000 0.031 0.037 0.045 0.037 0.033 0.022 0.018 0.078 0.052 0.006 0.063 0.054 0.047 0.032 0.006 0.022 0.036 0.017 0.012 0.000 0.015 0.005 0.005 0.000 0.004 0.004 0.000 0.000 0.019 0.024
foundation_type 0.007 0.196 0.098 0.033 0.147 0.041 0.169 0.171 0.057 0.031 1.000 0.549 0.359 0.410 0.104 0.054 0.106 0.556 0.146 0.207 0.075 0.514 0.343 0.308 0.505 0.548 0.110 0.146 0.176 0.061 0.262 0.199 0.073 0.040 0.022 0.022 0.037 0.020 0.020 0.310
roof_type 0.000 0.203 0.084 0.036 0.186 0.011 0.255 0.243 0.081 0.037 0.549 1.000 0.475 0.521 0.136 0.054 0.072 0.437 0.042 0.087 0.037 0.417 0.141 0.093 0.448 0.482 0.024 0.028 0.168 0.063 0.246 0.195 0.073 0.024 0.021 0.026 0.029 0.017 0.016 0.244
ground_floor_type 0.000 0.108 0.067 0.023 0.126 0.035 0.170 0.124 0.048 0.045 0.359 0.475 1.000 0.367 0.086 0.056 0.082 0.506 0.131 0.157 0.048 0.589 0.101 0.083 0.368 0.376 0.037 0.030 0.158 0.074 0.260 0.162 0.059 0.049 0.038 0.023 0.022 0.009 0.041 0.264
other_floor_type 0.006 0.126 0.068 0.030 0.583 0.021 0.188 0.305 0.071 0.037 0.410 0.521 0.367 1.000 0.119 0.055 0.087 0.451 0.125 0.101 0.037 0.443 0.164 0.070 0.386 0.430 0.040 0.065 0.186 0.068 0.271 0.193 0.083 0.045 0.028 0.031 0.034 0.000 0.024 0.248
position 0.000 0.134 0.078 0.027 0.316 0.113 0.051 0.218 0.043 0.033 0.104 0.136 0.086 0.119 1.000 0.026 0.190 0.280 0.012 0.041 0.344 0.127 0.059 0.060 0.102 0.102 0.000 0.030 0.128 0.032 0.214 0.066 0.010 0.006 0.027 0.007 0.000 0.000 0.014 0.048
plan_configuration 0.000 0.030 0.016 0.009 0.057 0.008 0.056 0.042 0.000 0.022 0.054 0.054 0.056 0.055 0.026 1.000 0.026 0.108 0.015 0.020 0.049 0.097 0.028 0.013 0.034 0.041 0.022 0.016 0.012 0.018 0.033 0.025 0.000 0.027 0.000 0.000 0.000 0.048 0.000 0.057
has_superstructure_adobe_mud 0.000 0.269 0.081 0.041 0.197 0.104 0.035 0.152 0.036 0.018 0.106 0.072 0.082 0.087 0.190 0.026 1.000 0.307 0.013 0.011 0.306 0.032 0.009 0.002 0.037 0.037 0.044 0.053 0.014 0.004 0.013 0.000 0.004 0.000 0.000 0.000 0.000 0.000 0.005 0.074
has_superstructure_mud_mortar_stone 0.000 0.330 0.157 0.033 0.354 0.069 0.245 0.261 0.065 0.078 0.556 0.437 0.506 0.451 0.280 0.108 0.307 1.000 0.032 0.110 0.361 0.474 0.037 0.050 0.227 0.232 0.032 0.144 0.082 0.065 0.162 0.115 0.033 0.029 0.029 0.006 0.010 0.000 0.000 0.336
has_superstructure_stone_flag 0.005 0.104 0.057 0.032 0.050 0.029 0.004 0.020 0.006 0.052 0.146 0.042 0.131 0.125 0.012 0.015 0.013 0.032 1.000 0.037 0.031 0.041 0.131 0.074 0.007 0.024 0.063 0.007 0.000 0.010 0.012 0.014 0.000 0.000 0.002 0.000 0.000 0.000 0.008 0.062
has_superstructure_cement_mortar_stone 0.013 0.062 0.027 0.020 0.039 0.000 0.074 0.043 0.019 0.006 0.207 0.087 0.157 0.101 0.041 0.020 0.011 0.110 0.037 1.000 0.000 0.083 0.019 0.000 0.087 0.031 0.001 0.018 0.045 0.014 0.072 0.038 0.017 0.008 0.000 0.000 0.000 0.000 0.014 0.062
has_superstructure_mud_mortar_brick 0.005 0.272 0.097 0.055 0.386 0.147 0.062 0.250 0.042 0.063 0.075 0.037 0.048 0.037 0.344 0.049 0.306 0.361 0.031 0.000 1.000 0.027 0.000 0.000 0.030 0.022 0.018 0.041 0.009 0.040 0.029 0.010 0.000 0.000 0.014 0.000 0.000 0.000 0.000 0.064
has_superstructure_cement_mortar_brick 0.008 0.208 0.125 0.033 0.257 0.000 0.216 0.146 0.048 0.054 0.514 0.417 0.589 0.443 0.127 0.097 0.032 0.474 0.041 0.083 0.027 1.000 0.060 0.053 0.142 0.128 0.006 0.075 0.070 0.058 0.139 0.098 0.024 0.015 0.029 0.000 0.000 0.003 0.005 0.278
has_superstructure_timber 0.000 0.208 0.089 0.054 0.097 0.017 0.054 0.062 0.039 0.047 0.343 0.141 0.101 0.164 0.059 0.028 0.009 0.037 0.131 0.019 0.000 0.060 1.000 0.446 0.027 0.070 0.108 0.105 0.020 0.005 0.030 0.027 0.000 0.000 0.000 0.007 0.000 0.000 0.005 0.073
has_superstructure_bamboo 0.000 0.235 0.092 0.037 0.085 0.012 0.035 0.061 0.032 0.032 0.308 0.093 0.083 0.070 0.060 0.013 0.002 0.050 0.074 0.000 0.000 0.053 0.446 1.000 0.012 0.039 0.119 0.086 0.023 0.002 0.035 0.019 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.069
has_superstructure_rc_non_engineered 0.013 0.042 0.059 0.030 0.105 0.009 0.192 0.168 0.053 0.006 0.505 0.448 0.368 0.386 0.102 0.034 0.037 0.227 0.007 0.087 0.030 0.142 0.027 0.012 1.000 0.008 0.018 0.007 0.119 0.023 0.169 0.115 0.048 0.019 0.007 0.004 0.002 0.000 0.003 0.192
has_superstructure_rc_engineered 0.000 0.060 0.066 0.043 0.151 0.009 0.227 0.258 0.071 0.022 0.548 0.482 0.376 0.430 0.102 0.041 0.037 0.232 0.024 0.031 0.022 0.128 0.070 0.039 0.008 1.000 0.009 0.008 0.108 0.031 0.149 0.130 0.039 0.017 0.019 0.011 0.029 0.016 0.008 0.237
has_superstructure_other 0.009 0.056 0.041 0.025 0.033 0.000 0.004 0.010 0.007 0.036 0.110 0.024 0.037 0.040 0.000 0.022 0.044 0.032 0.063 0.001 0.018 0.006 0.108 0.119 0.018 0.009 1.000 0.008 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.004 0.035
legal_ownership_status 0.003 0.085 0.065 0.029 0.066 0.009 0.000 0.036 0.017 0.017 0.146 0.028 0.030 0.065 0.030 0.016 0.053 0.144 0.007 0.018 0.041 0.075 0.105 0.086 0.007 0.008 0.008 1.000 0.028 0.016 0.053 0.006 0.000 0.017 0.000 0.000 0.000 0.000 0.018 0.067
has_secondary_use 0.000 0.105 0.024 0.000 0.082 0.008 0.118 0.135 0.112 0.012 0.176 0.168 0.158 0.186 0.128 0.012 0.014 0.082 0.000 0.045 0.009 0.070 0.020 0.023 0.119 0.108 0.000 0.028 1.000 0.741 0.525 0.244 0.084 0.053 0.097 0.043 0.017 0.032 0.199 0.072
has_secondary_use_agriculture 0.004 0.133 0.032 0.026 0.054 0.014 0.016 0.047 0.046 0.000 0.061 0.063 0.074 0.068 0.032 0.018 0.004 0.065 0.010 0.014 0.040 0.058 0.005 0.002 0.023 0.031 0.000 0.016 0.741 1.000 0.050 0.022 0.005 0.000 0.007 0.000 0.000 0.000 0.073 0.051
has_secondary_use_hotel 0.000 0.029 0.036 0.019 0.153 0.000 0.163 0.207 0.081 0.015 0.262 0.246 0.260 0.271 0.214 0.033 0.013 0.162 0.012 0.072 0.029 0.139 0.030 0.035 0.169 0.149 0.000 0.053 0.525 0.050 1.000 0.015 0.000 0.000 0.002 0.000 0.000 0.000 0.000 0.112
has_secondary_use_rental 0.005 0.039 0.034 0.011 0.064 0.000 0.098 0.109 0.086 0.005 0.199 0.195 0.162 0.193 0.066 0.025 0.000 0.115 0.014 0.038 0.010 0.098 0.027 0.019 0.115 0.130 0.000 0.006 0.244 0.022 0.015 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.095
has_secondary_use_institution 0.000 0.000 0.009 0.007 0.044 0.000 0.075 0.066 0.042 0.005 0.073 0.073 0.059 0.083 0.010 0.000 0.004 0.033 0.000 0.017 0.000 0.024 0.000 0.000 0.048 0.039 0.000 0.000 0.084 0.005 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.041
has_secondary_use_school 0.005 0.000 0.016 0.004 0.009 0.000 0.069 0.023 0.018 0.000 0.040 0.024 0.049 0.045 0.006 0.027 0.000 0.029 0.000 0.008 0.000 0.015 0.000 0.000 0.019 0.017 0.000 0.017 0.053 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.007
has_secondary_use_industry 0.000 0.009 0.000 0.009 0.043 0.000 0.009 0.046 0.016 0.004 0.022 0.021 0.038 0.028 0.027 0.000 0.000 0.029 0.002 0.000 0.014 0.029 0.000 0.000 0.007 0.019 0.000 0.000 0.097 0.007 0.002 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.016
has_secondary_use_health_post 0.002 0.017 0.000 0.000 0.027 0.000 0.013 0.059 0.009 0.004 0.022 0.026 0.023 0.031 0.007 0.000 0.000 0.006 0.000 0.000 0.000 0.000 0.007 0.000 0.004 0.011 0.000 0.000 0.043 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.019
has_secondary_use_gov_office 0.000 0.000 0.000 0.000 0.033 0.000 0.020 0.062 0.072 0.000 0.037 0.029 0.022 0.034 0.000 0.000 0.000 0.010 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.029 0.000 0.000 0.017 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.014
has_secondary_use_use_police 0.001 0.000 0.000 0.019 0.018 0.000 0.000 0.016 0.016 0.000 0.020 0.017 0.009 0.000 0.000 0.048 0.000 0.000 0.000 0.000 0.000 0.003 0.000 0.000 0.000 0.016 0.000 0.000 0.032 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.005
has_secondary_use_other 0.000 0.032 0.025 0.016 0.020 0.009 0.023 0.048 0.036 0.019 0.020 0.016 0.041 0.024 0.014 0.000 0.005 0.000 0.008 0.014 0.000 0.005 0.005 0.001 0.003 0.008 0.004 0.018 0.199 0.073 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.010
damage_grade 0.013 0.269 0.068 0.021 0.151 0.020 0.105 0.097 0.063 0.024 0.310 0.244 0.264 0.248 0.048 0.057 0.074 0.336 0.062 0.062 0.064 0.278 0.073 0.069 0.192 0.237 0.035 0.067 0.072 0.051 0.112 0.095 0.041 0.007 0.016 0.019 0.014 0.005 0.010 1.000
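
The report does not state which coefficient it uses. Every entry that involves a categorical column is non-negative, which is what an association measure such as Cramér's V would give; that reading is an assumption, not something the report confirms. Under that assumption, a minimal Python sketch of plain (uncorrected) Cramér's V for two categorical columns looks like this. The function name and the toy data are hypothetical.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed count of each (x, y) pair
    row = Counter(xs)             # marginal counts of xs
    col = Counter(ys)             # marginal counts of ys
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        return 0.0                # a constant column carries no association
    chi2 = 0.0
    for r, nr in row.items():
        for c, nc in col.items():
            expected = nr * nc / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Toy data, not taken from the report.
foundation = ["r", "r", "w", "w", "r", "w"]
roof = ["n", "n", "q", "q", "n", "q"]
print(cramers_v(foundation, roof))  # 1.0: each foundation value maps to one roof value
```

Values near 0 mean the two columns are close to independent, and values near 1 mean one column almost determines the other. Unlike a signed coefficient such as Pearson or Spearman, this measure says nothing about direction.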

Missing values

[Figure: nullity count by column, Matplotlib v3.6.3]
A simple visualization of nullity by column.
[Figure: nullity matrix, Matplotlib v3.6.3]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
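
Both displays derive from the same per-cell missing/present flags. The sketch below shows that computation in plain Python on a few toy rows; the rows and variable names are illustrative and not taken from the report, which lists 0 missing cells for this dataset.

```python
# Toy rows, not taken from the report; None marks a missing cell.
rows = [
    {"age": 30, "roof_type": "n"},
    {"age": None, "roof_type": "q"},
    {"age": 10, "roof_type": None},
    {"age": None, "roof_type": "n"},
]

columns = list(rows[0])

# Nullity by column: how many cells are missing in each column.
missing_by_column = {c: sum(r[c] is None for r in rows) for c in columns}

# Nullity matrix: one list of flags per record (True = missing).
nullity_matrix = [[r[c] is None for c in columns] for r in rows]

print(missing_by_column)
for flags in nullity_matrix:
    # "#" marks a present cell and "." a missing one.
    print("".join("." if missing else "#" for missing in flags))
```

With no missing cells, every per-column count is zero and every row of the matrix is fully filled.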

Sample

building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
080290664871219823065trnfqtd11000000000v1000000000003
1288308900281221087ornxqsd01000000000v1000000000002
29494721363897321055trnfxtd01000000000v1000000000003
3590882224181069421065trnfxsd01000011000v1000000000002
420194411131148833089trnfxsd10000000000v1000000000003
53330208558608921095trnfqsd01000000000v1110000000002
672845194751206622534nrnxqsd01000000000v1000000000003
747551520323122362086twqvxsu00000110000v1000000000001
84411260757721921586trqfqsd01000010000v1000000000002
99895002688699410134tinvjsd00000100000v1000000000001
building_id geo_level_1_id geo_level_2_id geo_level_3_id count_floors_pre_eq age area_percentage height_percentage land_surface_condition foundation_type roof_type ground_floor_type other_floor_type position plan_configuration has_superstructure_adobe_mud has_superstructure_mud_mortar_stone has_superstructure_stone_flag has_superstructure_cement_mortar_stone has_superstructure_mud_mortar_brick has_superstructure_cement_mortar_brick has_superstructure_timber has_superstructure_bamboo has_superstructure_rc_non_engineered has_superstructure_rc_engineered has_superstructure_other legal_ownership_status count_families has_secondary_use has_secondary_use_agriculture has_secondary_use_hotel has_secondary_use_rental has_secondary_use_institution has_secondary_use_school has_secondary_use_industry has_secondary_use_health_post has_secondary_use_gov_office has_secondary_use_use_police has_secondary_use_other damage_grade
4999097375426113266451065twnfjsd00000010000a1000000000001
4999193383421937395333567trqfxsd01000010000v2000000000003
499926537532281642702584trnfqsd01000000000v1000000000003
4999338552211128383033576nrnfqsd01000010000v1000000000003
4999459165612891002123575trnfqtd01000000000v1110000000002
4999546837322835676021565trnfxsd01000000000v1000000000002
499965329508597635122056trnfqtd01000000000v1110000000002
499978254027129485722094trnfqsd01000010000v1000000000003
4999882496411765108142536tunxqtd00001000000v1000000000002
499992838242049247212594nwnfqsd01000010000a1000000000002